Introduction

Streamlit is a Python package that allows users to create an interactive web app. Streamlit web apps are great platforms for interactive data visualizations in storyboard format. The apps are created from Python scripts that are run from top to bottom.

This tutorial will outline the basics of creating a Streamlit app and features that allow for the display and generation of interactive visualizations. The tutorial will specifically explore Streamlit’s integrations with the Python plotting libraries Matplotlib/Seaborn, Altair, and Plotly.

Demo Context:

For this demo, we will be creating an app focused on the Low Income Home Energy Assistance Program (LIHEAP), which is operated by the U.S. federal government.

The full Python script for the demo app can be found here: https://github.com/carolineeadams/LIHEAP

The fully rendered Streamlit app can be found here: https://carolineeadams-liheap-streamlit-tutorial-50cudv.streamlit.app/

The Low Income Home Energy Assistance Program (LIHEAP) has been administered by the Department of Health and Human Services (HHS) since 1981 to provide low-income households with financial assistance to cover energy bills (heating, cooling) and weatherization of their homes. LIHEAP is a program highlighted under the Biden Administration’s Justice40 Initiative, which pledges to allocate 40 percent of federal investments in climate-tangential areas to communities historically overburdened by pollution and the negative effects of climate change.

The LIHEAP program offers financial assistance to households in need of support with energy (heat and cooling) and weatherization expenses. Program leaders anticipate that due to climate change, the program’s role in supporting families with their cooling expenses will increase substantially over the next few decades and grow to be a larger proportion of the program’s main focus. The LIHEAP program provides benefits to households with low incomes, older adults, young children, and people with disabilities. These populations match some of the communities that the Environmental Protection Agency (EPA) has indicated are placed at highest risk of experiencing the negative effects of extreme heat due to climate change.

This demo app retrieves publicly available data from the 2019 American Community Survey 5-Year Estimates related to the populations described above that are at highest risk of experiencing the negative effects of extreme heat. Variables can be explored in a variety of ways to understand variation across the United States in terms of proportions of populations in each state. An average vulnerability index is also calculated based on the available data to demonstrate which states have the highest proportions of populations vulnerable to extreme heat.

Installation

To install Streamlit, you will need a recent version of Python, the Anaconda Navigator, an integrated development environment (IDE), and pip. pip is usually installed automatically with Python.

For Windows, using Anaconda’s terminal:

To install Anaconda: https://docs.anaconda.com/anaconda/install/windows/

Install the streamlit package:

pip install streamlit

To confirm that the installation worked, run the following command, which should open a tab in your web browser:

streamlit hello

For Macs:

To install pip: https://pip.pypa.io/en/stable/installation/#supported-methods

Install Xcode command line tools:

xcode-select --install

Install Streamlit using the command line:

pip install streamlit

Confirm that the installation was successful by running:

streamlit hello

Create an App

Begin by starting a new Python script. Our script will be named Streamlit_Tutorial.py

Open your script and import the streamlit package as well as any other necessary packages.

#importing the streamlit package
import streamlit as st


#optional additional packages for this demo
#Package to request data through use of an API
import requests

#Data manipulation packages
import pandas as pd
import numpy as np

#Data visualization packages
import geopandas as gpd
import altair as alt
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px

Running App

Streamlit apps are run by entering streamlit run followed by your .py file name in the command line. The app will be rendered in your default web browser based on the last saved version of the script file.

streamlit run Streamlit_Tutorial.py

Writing Script

Streamlit apps are built from Python scripts that run from top to bottom. That means that the structure of the script will determine how the app content is ordered.

Streamlit apps can be a mix of text, data visualizations, and interactive widgets. Adding text to the app can be helpful to provide users with important context for the displayed data visualizations.

Title

To add a title, enter a string of your choice inside of st.title("enter string here")

Headers and Subheaders

Similarly, headers and subheaders can be added using st.header("enter string here") and st.subheader("enter string here")

Example

st.title("Communities Placed At Highest Risk of the Negative Effects of Extreme Heat from Climate Change")
st.subheader("Background")

General Text

General text can be added with: st.write("enter text here")

Example

st.write("The Low Income Home Energy Assistance Program (LIHEAP) has been administered by the Department of Health and Human Services (HHS) since 1981 to provide low-income households with financial assistance to cover energy bills (heating, cooling) and weatherization of their homes. LIHEAP is a program highlighted under the Biden Administration's Justice40 Initiative, which pledges to allocate 40 percent of federal investments in climate-tangential areas to communities historically overburdened by pollution and the negative effects of climate change.")

Start a new paragraph by adding a new st.write() on a new line

st.write("The Low Income Home Energy Assistance Program (LIHEAP) has been administered by the Department of Health and Human Services (HHS) since 1981 to provide low-income households with financial assistance to cover energy bills (heating, cooling) and weatherization of their homes. LIHEAP is a program highlighted under the Biden Administration's Justice40 Initiative, which pledges to allocate 40 percent of federal investments in climate-tangential areas to communities historically overburdened by pollution and the negative effects of climate change.")

st.write("The LIHEAP program offers financial assistance to households in need of support with energy (heat and cooling) and weatherization expenses. Program leaders anticipate that due to climate change, the program's role in supporting families with their cooling expenses will increase substantially over the next few decades and grow to be a larger proportion of the program's main focus. The LIHEAP program provides benefits to households with low incomes, older adults, young children, and people with disabilities. These populations match some of the communities that the Environmental Protection Agency (EPA) has indicated are placed at highest risk of experiencing the negative effects of extreme heat due to climate change.")

The st.write() function can also display other types of information in addition to general text. For example, someone could use st.write() to display a figure, variable, or dataframe that was defined in the script.

Obtaining Data

To use data in the app, the data must first be loaded in the script. A dataframe must be assigned to a variable. For interactive use of dataframes (explored later in this demo), define and call a function that reads the data from its source location and returns it, so that the result can be saved to an object for the app to use later.

Example

In this demo, I am retrieving data from the American Community Survey (ACS) produced by the U.S. Census Bureau. The ACS has population estimates of demographic and other characteristics of individuals and households across the United States. For this app, I will be collecting variables from the ACS relevant to factors that increase one’s vulnerability to extreme heat and eligibility for the LIHEAP program.

To get the data, I am using the Census API to obtain the variables I want and am merging the data with geographic shapefiles for mapping purposes.

Note: The code for making these API calls and cleaning the data is very long and will not be shown in this tutorial. To find the full code, please access the Python script for the demo app here: https://github.com/carolineeadams/LIHEAP

The two most important lines of the longer Python script are included in the cell directly below. These two lines demonstrate how this data is obtained through a function call and the results are saved to an object that can be used by the Streamlit app.

#run first function to obtain and clean Census data, and save to python object, census_df_lmtd
census_df_lmtd=clean_dta(census_df)

#run second function to obtain geographic shapefile data, and save to an object, us51
us51=get_geo_dta()
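
For readers who want a feel for the pattern without opening the full script, here is a minimal sketch of a loader-style function. It is not the author's code: the function name census_rows_to_df, the variable EXAMPLE_VAR, and the sample values are all hypothetical. It assumes a payload shaped as a header row followed by data rows, with numbers arriving as strings, which is how Census API JSON responses are typically structured.

```python
import pandas as pd


def census_rows_to_df(rows):
    """Convert a payload of [header, row, row, ...] into a dataframe (hypothetical helper)."""
    header, *records = rows
    return pd.DataFrame(records, columns=header)


# Hypothetical payload with placeholder values, shaped like an API response
sample_rows = [
    ["NAME", "EXAMPLE_VAR", "state"],
    ["State A", "100", "01"],
    ["State B", "200", "02"],
]

example_census_df = census_rows_to_df(sample_rows)
# Values arrive as strings, so convert them before doing arithmetic
example_census_df["EXAMPLE_VAR"] = pd.to_numeric(example_census_df["EXAMPLE_VAR"])
```

In a live app, rows might come from something like requests.get(url).json(), where url is the API query you have built.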

Now that you have imported the package, created your script, and generated or obtained your data, you can begin to experiment with data visualizations.

Integration with Plotting Packages

A variety of plotting packages can be used with Streamlit apps. In this demo, we will review integrations with Matplotlib/Seaborn, Altair, and Plotly. Streamlit also has built-in chart functions that create simple plots directly from dataframes.

Matplotlib & Seaborn

Figures generated through the matplotlib.pyplot module can be displayed in Streamlit apps. To show a figure, the function st.pyplot() must be used.

Plots generated using the package Seaborn also rely on the st.pyplot() function, as Seaborn is built on top of Matplotlib.

Example

In the following example, a histogram of the proportion of individuals over the age of 65 by state across the United States is generated using Seaborn. The figure is defined and designed in the usual fashion and saved to an object fig. fig is then passed to the function st.pyplot(fig) to ensure the plot is displayed in the rendered app.

#save one feature of the Census data to x1
x1=census_df_lmtd['Percentage of Individuals Ages 65 and Over']
#fill NA values with the mean for the country
x1=x1.fillna(x1.mean())
#create histogram of the x1 variable
fig = sns.displot(census_df_lmtd, x=x1, color = "#FF6347")

#run streamlit plotting function and display plot in app
st.pyplot(fig)

Altair

Altair is another plotting package that is compatible with Streamlit apps. Figures can be defined normally in the Python script according to the package features and saved to a Python object. To display Altair-based plots in Streamlit apps, the function st.altair_chart() should be used, with the Python object passed into the function.

While not covered in this demo, the Altair library allows for creation of many interactive features within the Streamlit app. For example, you can generate layered charts, which include annotations of text, images, and even emojis. Tooltips can also be created.

Example

In this example, an ordered bar chart showing the number of individuals in each state with incomes below 150% of the federal poverty level is created using Altair. The plot is defined and saved to the object, alt_plot, which is then passed into the function st.altair_chart().

#generating altair bar chart
alt_plot = alt.Chart(census_df_lmtd).mark_bar(color="#4B0082").\
encode(alt.X("NAME", sort="-y"),
        y='Individuals with Incomes Below 150% FPL')

#display chart in streamlit app
st.altair_chart(alt_plot)

Plotly

Plotly is an interactive charting Python library that produces high quality visualizations. A variety of plots can be generated using this library. To show a figure, the function st.plotly_chart() must be used in the Python script.

Example

For this example, I have used Plotly to create an interactive map of the United States that shows the number of individuals in each state with a disability. The map is saved to an object, fig, which is then passed to the function st.plotly_chart().

#generating map and saving plot to object, "fig"
#us51 (returned by get_geo_dta()) is assumed here as the dataframe; its 'State' column holds two-letter state abbreviations
fig = px.choropleth(us51, locations='State', color='Total Number of Individuals with Disabilities',
                           color_continuous_scale="pinkyl",
                           hover_name='NAME',
                           locationmode='USA-states',
                           scope="usa"
                          )

#showing figure in streamlit app
st.plotly_chart(fig)

Plotly maps come with a variety of interactive features in Streamlit apps. When hovering over the plot, a menu bar of icons shows up in the upper right corner. Users have the option to download the plot as a .png file, zoom in and out, and expand the size of the map.

In addition, users can hover over different regions of the map and information boxes appear. These boxes show the name of the region (in this map’s case, the name of the U.S. state) as well as the value of the variable that is mapped.

For other plotting packages, users have the option to expand all plots that are rendered in the app.

Widgets

In addition to just displaying a mix of plots and text on the apps, widgets can be added to allow visitors to interact with the visualizations.

Checkbox

Checkboxes can be added for additional user interaction. In terms of data visualization, similar to select boxes, checkboxes can be used to allow users to select what content to explore.

To add a checkbox, use the function st.checkbox(), which returns True when the box is checked and False otherwise. You can assign the result to a variable to use later in the script, or place the call directly in an if statement to tell the script what to do when the box is checked.

Example

In the following example, the checkbox function is placed into an if statement that tells the script to display a dataframe if the box is checked.

#adding subheader to describe data displayed by checkbox in streamlit app
st.subheader('Raw Data from the American Community Survey 5-Year Estimates (2019)')

#allow users to explore raw data if they check a box
if st.checkbox("Explore Raw Data"):
    st.dataframe(census_df_lmtd)

If the box is checked, an interactive dataframe appears. Users can scroll through and sort the rows and columns.

Caching

When the features in the app are reliant on large datasets or datasets that are built from requests or API calls, it can be very time consuming for the app to re-run and regenerate different features. For example, if the app has to rerun the script and perform a request to load data from the internet each time a user switches which variable is selected in a widget, the app may take too long to produce the content to be a useful tool.

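To optimize app performance with large data or complicated data sources, place the @st.cache decorator directly above the definition of each function that generates the data. Note that newer Streamlit releases deprecate @st.cache in favor of @st.cache_data and @st.cache_resource; this demo uses @st.cache.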

@st.cache tells the app to check the input parameters of the function, the value of any external variable used, the body of the function, and the body of any function used inside the cached function. If the app is seeing any of those four components for the first time, it will run the entire function and store the results in a cache. In the future, if the components remain the same, the app will not re-execute the function and will instead use the information stored in the cache to display the results.

@st.cache can be used for the data generation itself and also for the generation of plots and figures in the app. It does not speed up the first run, when the function must still execute, but it can significantly reduce the time it takes for the app to re-run and load new visualizations based on a user’s selections.
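
The general idea behind this kind of caching is memoization: store a result the first time it is computed and reuse it when the same inputs appear again. The sketch below is not Streamlit's implementation. It uses functools.lru_cache from Python's standard library as a stand-in, and it only keys on input parameters (unlike @st.cache, it does not track external variables or function bodies). The names slow_square and calls are hypothetical.

```python
from functools import lru_cache

# Tracks how many times the function body actually runs
calls = {"count": 0}


@lru_cache(maxsize=None)
def slow_square(n):
    """Stand-in for an expensive step such as an API request."""
    calls["count"] += 1
    return n * n


first = slow_square(4)   # new input: the body runs and the result is stored
second = slow_square(4)  # same input: the stored result is reused
third = slow_square(5)   # different input: the body runs again
```

After these three calls the function body has only executed twice, which is the saving that caching provides when a user revisits a selection.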

Data Generation Example

The following example demonstrates how a function can be defined to obtain data of interest. The data that is obtained is the Census shapefile, which is merged with the original dataframe so that maps can be generated using the geographic indicators included in the shapefile. By adding @st.cache directly above the function definition, the Streamlit app knows to locally cache the data the first time this function is called.

#define function to get geographic shapefile data; the decorator caches its result
@st.cache
def get_geo_dta():
    #set file path to shapefile document
    path = "Data/tl_2021_us_state.shp"
    #read in shapefile as a geopandas dataframe
    df = gpd.read_file(path)
    #reproject to the EPSG:4326 (latitude/longitude) coordinate reference system
    df = df.to_crs("EPSG:4326")
    #drop state name column from census_df_lmtd before merging
    census_df_lmtd2 = census_df_lmtd.drop(columns=['NAME'])
    #merge census_df_lmtd onto shapefile using the state FIPS code
    geo_dta = df.merge(census_df_lmtd2, on='STATEFP')
    #rename the state abbreviation column to "State"
    geo_dta = geo_dta.rename(columns={'STUSPS': 'State'})
    #abbreviations of non-state territories to exclude
    non_states = ['VI', 'MP', 'GU', 'AS', 'PR']
    #keep only rows whose State is not in the non_states list
    us51 = geo_dta[~geo_dta['State'].isin(non_states)]
    #returning merged and cleaned dataframe with geographic data
    return us51

Plotting Example

The following example defines a plotting function, plot_hist(), that produces a Seaborn histogram for whichever variable the user selects in a selectbox (stored here in variable_selected).

By adding @st.cache directly above the function definition, the app knows to locally cache the histogram that is produced the first time the function is run for each variable in the selectbox.

Note: adding allow_output_mutation=True inside of @st.cache suppresses a warning that is outside the scope of this demo.

#define function to plot histograms of nationwide distribution of vulnerable population
@st.cache(allow_output_mutation=True)
def plot_hist(var):
    x1=census_df_lmtd[var]
    x1=x1.fillna(x1.mean())
    fig = sns.displot(census_df_lmtd, x=x1, color = "#FF6347")
    return fig

#run function and display plot
st.pyplot(plot_hist(variable_selected))

Final Notes

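The app is now ready to run! As noted earlier, to explore the full code, underlying data, and the rendered app, use the following links:

Python script for the demo app: https://github.com/carolineeadams/LIHEAP

Rendered Streamlit app: https://carolineeadams-liheap-streamlit-tutorial-50cudv.streamlit.app/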

Streamlit has a variety of other features that can be added or adjusted. This demo explored some of the most important features for building an interactive web app to show off data visualizations that are complemented by text. The top-to-bottom ordering of a Streamlit app and its integration of text and figures allow creators to build an informative platform that tells a story and engages users through its interactive design.